ABSTRACT



This thesis presents a statistical analysis of the 
monthly rainfall for the Monterey Peninsula and the Carmel 
Valley in Central California. The analysis begins with the 
simple first-order autoregressive Markov model, which is 
found to be weak. Next, 2x2 contingency tables are used 
to identify predictors, one of which is found to be January 
rainfall. Finally, logistic analysis is used to quantify 
the predictive ability of January. 

This paper attempts to analyze rainfall time series 
in the statistical sense. No attempt is made to provide 
a physical explanation of the findings from the point of 
view of a meteorologist. 






TABLE OF CONTENTS 



I. INTRODUCTION
   A. THE PROBLEM
   B. NOTATION
   C. METHODS OF ANALYSIS

II. THE DATA
   A. GENERAL
   B. DATA SET 'RN'
   C. DATA SET 'FL'
   D. DATA SET 'SC'

III. FIRST ORDER MARKOV MODEL
   A. THEORY
   B. DATA SET 'RN'
   C. DATA SET 'FL'
   D. DATA SET 'SC'

IV. VALIDATION OF FIRST ORDER MARKOV MODEL
   A. THEORY
   B. DATA SET 'RN'
   C. DATA SET 'FL'
   D. DATA SET 'SC'
   E. CONCLUSIONS

V. 2x2 TABLES
   A. THEORY
   B. ANALYSIS
   C. OTHER RESULTS

VI. LOGISTIC ANALYSIS
   A. THEORY
   B. ANALYSIS
   C. DISCUSSION

VII. VALIDATION OF LOGISTIC MODELS
   A. GENERAL
   B. RESULTS
   C. DISCUSSION

VIII. FURTHER FINDINGS
   A. SUMMER MONTHS
   B. SIGNIFICANCE OF JANUARY

IX. SUMMARY

APPENDIX A: DATA SET RN

APPENDIX B: DATA SET FL

APPENDIX C: DATA SET SC

LIST OF REFERENCES

INITIAL DISTRIBUTION LIST



LIST OF TABLES 



1. ESTIMATED AUTOCORRELATIONS OF YEARLY TOTAL RAINFALL FOR DATA SET RN
2. MONTHLY MEANS AND VARIANCE FOR DATA SET RN
3. ESTIMATED AUTOCORRELATIONS OF MONTHLY RAINFALL ANOMALIES FOR DATA SET RN
4. MONTHLY MEANS AND VARIANCE FOR LOGGED DATA SET RN
5. ESTIMATED AUTOCORRELATIONS OF LOGGED ANOMALIES OF MONTHLY RAINFALL FROM DATA SET RN
6. ESTIMATED AUTOCORRELATIONS OF YEARLY TOTAL RAINFALL FOR DATA SET FL
7. MONTHLY MEANS AND VARIANCE FOR DATA SET FL
8. ESTIMATED AUTOCORRELATIONS OF MONTHLY RAINFALL ANOMALIES FOR DATA SET FL
9. MONTHLY MEANS AND VARIANCE FOR LOGGED DATA SET FL
10. ESTIMATED AUTOCORRELATIONS OF LOGGED ANOMALIES FROM MONTHLY RAINFALL OF DATA SET FL
11. ESTIMATED AUTOCORRELATIONS OF YEARLY TOTAL RAINFALL FOR DATA SET SC
12. MONTHLY MEANS AND VARIANCE FOR DATA SET SC
13. ESTIMATED AUTOCORRELATIONS OF MONTHLY RAINFALL ANOMALIES FOR DATA SET SC
14. MONTHLY MEANS AND VARIANCE OF LOGGED DATA SET SC
15. ESTIMATED AUTOCORRELATIONS OF LOGGED ANOMALIES OF MONTHLY RAINFALL FROM DATA SET SC
16. ESTIMATED PARTIAL AUTOCORRELATIONS FOR LOGGED RAINFALL ANOMALIES OF DATA SET RN
17. ACTUAL AND EXPECTED NUMBER OF TURNING POINTS AND ACTUAL AND EXPECTED PHASE FREQUENCIES FOR THE FIRST ORDER MARKOV RESIDUALS FROM THE LOGGED RAINFALL ANOMALIES OF DATA SET RN



18. GENERAL STATISTICS OF FIRST ORDER MARKOV RESIDUALS FROM LOGGED RAINFALL ANOMALIES OF DATA SET RN
19. ACTUAL AND EXPECTED NUMBER OF TURNING POINTS AND ACTUAL AND EXPECTED PHASE FREQUENCIES FOR THE FIRST ORDER MARKOV RESIDUALS OF THE LOGGED RAINFALL FOR WINTER MONTHS ONLY, DATA SET RN
20. GENERAL STATISTICS OF FIRST ORDER MARKOV RESIDUALS FROM LOGGED RAINFALL ANOMALIES OF WINTER MONTHS, DATA SET RN
21. ESTIMATED PARTIAL AUTOCORRELATIONS FOR LOGGED RAINFALL ANOMALIES OF DATA SET FL
22. ACTUAL AND EXPECTED NUMBER OF TURNING POINTS AND ACTUAL AND EXPECTED PHASE FREQUENCIES FROM THE FIRST ORDER MARKOV RESIDUALS FROM DATA SET FL
23. GENERAL STATISTICS OF FIRST ORDER MARKOV RESIDUALS FROM LOGGED RAINFALL ANOMALIES OF DATA SET FL
24. ACTUAL AND EXPECTED NUMBER OF TURNING POINTS AND ACTUAL AND EXPECTED PHASE FREQUENCIES FROM THE FIRST ORDER MARKOV RESIDUALS OF WINTER MONTHS ONLY, DATA SET FL
25. GENERAL STATISTICS OF FIRST ORDER MARKOV RESIDUALS FROM LOGGED RAINFALL ANOMALIES OF WINTER MONTHS ONLY, DATA SET FL
26. ESTIMATED PARTIAL AUTOCORRELATIONS FOR LOGGED RAINFALL ANOMALIES OF DATA SET SC
27. ACTUAL AND EXPECTED NUMBER OF TURNING POINTS AND ACTUAL AND EXPECTED PHASE FREQUENCIES FROM THE FIRST ORDER MARKOV RESIDUALS OF DATA SET SC
28. GENERAL STATISTICS OF FIRST ORDER MARKOV RESIDUALS FROM LOGGED RAINFALL ANOMALIES OF DATA SET SC
29. ACTUAL AND EXPECTED NUMBER OF TURNING POINTS AND ACTUAL AND EXPECTED PHASE FREQUENCIES FROM THE FIRST ORDER MARKOV RESIDUALS OF THE LOGGED RAINFALL ANOMALIES OF THE WINTER MONTHS ONLY, DATA SET SC
30. GENERAL STATISTICS OF FIRST ORDER MARKOV RESIDUALS FROM LOGGED RAINFALL ANOMALIES OF WINTER MONTHS ONLY, DATA SET SC



31. ACTUAL AND EXPECTED NUMBER OF TURNING POINTS AND ACTUAL AND EXPECTED PHASE FREQUENCIES FOR THE FORECAST ERRORS OF THE FIRST ORDER MARKOV MODEL APPLIED TO THE WINTER MONTHS OF RESERVED DATA SET RN
32. GENERAL STATISTICS OF FORECAST ERRORS FROM THE FIRST ORDER MARKOV MODEL APPLIED TO THE WINTER MONTHS OF RESERVED DATA SET RN
33. ACTUAL AND EXPECTED NUMBER OF TURNING POINTS AND ACTUAL AND EXPECTED PHASE FREQUENCIES FOR THE FORECAST ERRORS OF THE FIRST ORDER MARKOV MODEL APPLIED TO THE WINTER MONTHS OF RESERVED DATA SET FL
34. GENERAL STATISTICS OF FORECAST ERRORS FROM THE FIRST ORDER MARKOV MODEL APPLIED TO THE WINTER MONTHS OF RESERVED DATA SET FL
35. ACTUAL AND EXPECTED NUMBER OF TURNING POINTS AND ACTUAL AND EXPECTED PHASE FREQUENCIES FOR THE FORECAST ERRORS OF THE FIRST ORDER MARKOV MODEL APPLIED TO THE WINTER MONTHS ONLY OF RESERVED DATA SET SC
36. GENERAL STATISTICS OF FORECAST ERRORS FROM THE FIRST ORDER MARKOV MODEL APPLIED TO THE WINTER MONTHS OF RESERVED DATA SET SC
37. SIGNIFICANCE OF OBSERVED DEPARTURES FROM THE INDEPENDENCE OF SINGLE MONTH CONTROLS VERSUS SUCCEEDING ELEVEN MONTH COMPLEMENTS
38. SIGNIFICANCE OF OBSERVED DEPARTURES FROM INDEPENDENCE OF PAIRED MONTHS CONTROLS VERSUS SUCCEEDING TEN MONTH COMPLEMENTS
39. SIGNIFICANCE OF OBSERVED DEPARTURES FROM INDEPENDENCE OF TRIPLES OF MONTHS CONTROLS VERSUS SUCCEEDING NINE MONTH COMPLEMENTS
40. SIGNIFICANCE OF OBSERVED DEPARTURES FROM INDEPENDENCE OF FOUR-TUPLES OF MONTHS CONTROLS VERSUS SUCCEEDING EIGHT MONTH COMPLEMENTS
41. ODDS RATIO OF JANUARY VERSUS FEBRUARY THROUGH DECEMBER AND JANUARY PLUS DECEMBER VERSUS FEBRUARY THROUGH NOVEMBER
42. ANOVA FOR REGRESSION OF SIMPLE LINEAR MODEL FOR ALL DATA SETS



43. ANOVA FOR REGRESSION OF SIMPLE LINEAR MODEL WITH MEANS REMOVED FOR ALL DATA SETS
44. DATA SET RN, LOGGED JANUARY ANOMALIES AND SUCCESSES FOR GROUPED AND UNGROUPED FORMS
45. DATA SET FL, LOGGED JANUARY ANOMALIES AND SUCCESSES FOR GROUPED AND UNGROUPED FORMS
46. DATA SET SC, LOGGED JANUARY ANOMALIES AND SUCCESSES FOR GROUPED AND UNGROUPED FORMS
47. ORDINARY LEAST SQUARES REGRESSION WITH THE MODEL OF EQUATION VI.7
48. ITERATIVELY REWEIGHTED LEAST SQUARES REGRESSION USING BIWEIGHTS FOR THE MODEL OF EQUATION VI.7
49. MAXIMUM LIKELIHOOD ESTIMATES OF α AND β ALONG WITH ESTIMATES OF THEIR VARIANCE FOR ALL THREE DATA SETS
50. PARAMETER FIT RECAPITULATION FOR ALL DATA SETS
51. DATA USED FOR MODEL FITS OF LOGISTIC MODELS
52. RESERVED DATA IN FORM FOR THE LOGISTIC ANALYSIS
53. RESULTS OF LOGISTIC VALIDATION ON DATA SET RN
54. RESULTS OF LOGISTIC VALIDATION ON DATA SET FL
55. RESULTS OF VALIDATION ON DATA SET SC



LIST OF FIGURES 



1. Location of rainfall data sets and the years available
2. Monthly rainfall in inches for data set RN
3. Yearly total rainfall for data set RN (1951-1974)
4. Correlogram of yearly total rainfall for data set RN
5. Lag one plot of yearly rainfall data for data set RN
6. Monthly means for data set RN
7. Monthly rainfall anomalies in inches for data set RN
8. Correlogram of the monthly rainfall anomalies for data set RN
9. Plot of monthly variance against monthly means for data set RN
10. Monthly means of logged data set RN
11. Plot of monthly variance against monthly means for logged data set RN
12. Logged anomalies of monthly rainfall for data set RN
13. Correlogram of logged anomalies of monthly rainfall from data set RN
14. Rainfall in inches, by month, of data set FL
15. Yearly total rainfall for data set FL (1937-1974)
16. Correlogram of yearly total rainfall for data set FL
17. Monthly means for data set FL
18. Monthly rainfall anomalies in inches for data set FL
19. Correlogram of monthly rainfall anomalies for data set FL



20. Plot of monthly variance against monthly means for data set FL
21. Monthly means of logged data set FL
22. Plot of monthly variance against monthly means for logged data set FL
23. Months of logged rainfall anomalies for data set FL
24. Correlogram of logged anomalies of monthly rainfall from data set FL
25. Monthly plot of rainfall in inches for data set SC
26. Yearly total rainfall for data set SC
27. Correlogram of yearly total rainfall for data set SC
28. Monthly means for data set SC
29. Rainfall anomalies in inches for data set SC
30. Correlogram of monthly rainfall anomalies for data set SC
31. Plot of monthly variance against monthly means for data set SC
32. Monthly means of logged data set SC
33. Plot of monthly variance against monthly means for logged data set SC
34. Months of logged rainfall anomalies from data set SC
35. Correlogram of logged anomalies of monthly rainfall from data set SC
36. Partial correlogram of the logged rainfall anomalies of data set RN
37. First order Markov residuals from logged rainfall anomalies of data set RN
38. Autocorrelations of residuals from first order Markov process applied to the logged rainfall anomalies of data set RN



39. Lag one plot of first order Markov residuals from logged rainfall anomalies of data set RN
40. First order Markov residuals versus lag one data point from logged rainfall anomalies of data set RN
41. Standardized normal plot of first order Markov residuals from logged rainfall anomalies of data set RN
42. Histogram of first order Markov residuals from logged rainfall anomalies of data set RN
43. Winter months only of logged rainfall anomalies of data set RN
44. Correlogram of winter months only of logged rainfall anomalies from data set RN
45. Partial autocorrelations of winter months only of logged rainfall anomalies from data set RN
46. First order Markov residuals of logged rainfall anomalies for winter months only of data set RN
47. Correlogram of first order Markov residuals of logged rainfall anomalies for winter months only of data set RN
48. Lag one plot of first order Markov residuals from logged rainfall anomalies for winter months only, data set RN
49. First order Markov residuals versus lag one data point from logged rainfall anomalies of winter months only, data set RN
50. Standardized normal plot of first order Markov residuals from logged rainfall anomalies of winter months only from data set RN
51. Histogram of first order Markov residuals from logged rainfall anomalies of winter months only, data set RN
52. Partial correlogram of the logged rainfall anomalies for data set FL
53. First order Markov residuals from logged rainfall anomalies of data set FL



54. Autocorrelations of residuals from first order Markov process applied to the logged rainfall anomalies of data set FL
55. Lag one plot of first order Markov residuals from logged rainfall anomalies of data set FL
56. First order Markov residuals versus lag one data points from logged rainfall anomalies of data set FL
57. Standardized normal plot of first order Markov residuals from logged rainfall anomalies of data set FL
58. Histogram of first order Markov residuals from logged rainfall anomalies of data set FL
59. Winter months only of logged rainfall anomalies of data set FL
60. Correlogram of winter months only, logged rainfall anomalies from data set FL
61. Partial correlogram of winter months only, logged rainfall anomalies from data set FL
62. First order Markov residuals of logged rainfall anomalies for winter months only, data set FL
63. Correlogram of first order Markov residuals of logged rainfall anomalies from winter months only, data set FL
64. Lag one plot of first order Markov residuals from logged rainfall anomalies of winter months only, data set FL
65. First order Markov residuals versus lag one data point from logged rainfall anomalies of winter months only, data set FL
66. Standardized normal plot of first order Markov residuals from logged rainfall anomalies of winter months only, data set FL
67. Histogram of first order Markov residuals from logged rainfall anomalies of winter months only, data set FL
68. Partial correlogram of the logged anomalies for data set SC



69. First order Markov residuals from logged rainfall anomalies of data set SC
70. Autocorrelations of residuals from first order Markov process applied to the logged rainfall anomalies of data set SC
71. Lag one plot of first order Markov residuals from logged rainfall anomalies of data set SC
72. First order Markov residuals versus lag one data point from logged rainfall anomalies of data set SC
73. Standardized normal plot of first order Markov residuals from logged rainfall anomalies of data set SC
74. Histogram of first order Markov residuals from logged rainfall anomalies of data set SC
75. Winter months only, logged rainfall anomalies of data set SC
76. Correlogram of winter months only, logged rainfall anomalies from data set SC
77. Partial correlogram of winter months only, logged rainfall anomalies from data set SC
78. First order Markov residuals of logged rainfall anomalies, for winter months only, data set SC
79. Correlogram of first order Markov residuals of logged rainfall anomalies from winter months only, data set SC
80. Lag one plot of first order Markov residuals from logged rainfall anomalies of winter months only, data set SC
81. First order Markov residuals versus lag one data point from logged rainfall anomalies of winter months only, data set SC
82. Standardized normal plot of first order Markov residuals from logged rainfall anomalies of winter months only, data set SC
83. Histogram of first order Markov residuals from logged rainfall anomalies of winter months only, data set SC



84. Reserved rainfall data for data set RN
85. Logged rainfall anomalies of reserved data set RN
86. Forecast errors from first order Markov model applied to winter months of reserved data set RN
87. Correlogram of forecast errors from first order Markov model applied to winter months of reserved data set RN
88. Standardized normal plot of forecast errors from the first order Markov model applied to the winter months of reserved data set RN
89. Histogram of forecast errors from the first order Markov model applied to the winter months of reserved data set RN
90. Reserved rainfall data for data set FL
91. Logged rainfall anomalies of reserved data set FL
92. Forecast errors from the first order Markov model applied to the winter months of reserved data set FL
93. Correlogram of forecast errors from first order Markov model applied to the winter months of reserved data set FL
94. Standardized normal plot of forecast errors from the first order Markov model applied to the winter months of reserved data set FL
95. Histogram of forecast errors from the first order Markov model applied to the winter months of reserved data set FL
96. Reserved rainfall for data set SC
97. Logged anomalies of reserved data set SC
98. Forecast errors from first order Markov model applied to the winter months of reserved data set SC
99. Correlogram of forecast errors from first order Markov model applied to the winter months of reserved data set SC



100. Standardized normal plot of forecast errors from the first order Markov model applied to the winter months of reserved data set SC
101. Histogram of forecast errors from the first order Markov model applied to the winter months of reserved data set SC
102. Typical 2x2 contingency table
103. Contours of log likelihood function for data set RN
104. Contours of log likelihood function for data set FL
105. Contours of log likelihood function for data set SC
106. Estimated probability of greater-than-average total rest-of-year rainfall versus the anomaly of logged rainfall for January for data set RN
107. Estimated probability of greater-than-average total rest-of-year rainfall versus the anomaly of logged rainfall for January for data set FL
108. Estimated probability of greater-than-average total rest-of-year rainfall versus the anomaly of logged rainfall for January for data set SC
109. 2x2 contingency tables of reserved data controlled by the anomaly of January rainfall. The complement is the anomaly of the rest of year rainfall
110. Plot of versus complement anomalies for data set RN
111. Plot of versus complement anomalies for data set FL
112. Plot of versus complement anomalies for maximum likelihood and IRWLS parameters for data set SC
113. Monthly plot of summer months only, means removed, data set RN
114. Monthly plot of summer months only, means removed, data set FL
115. Monthly plot of summer months only, means removed, data set SC



116. Yearly plot of total summer rainfall for data set RN
117. Yearly plot of total summer rainfall for data set FL
118. Yearly plot of total summer rainfall for data set SC
119. Log-odds and significance versus additional months cumulated through the year
120. Log-odds and significance versus additional months of the fall



ACKNOWLEDGEMENT 



I would like to thank my advisors, Professors
D. P. Gaver and P. A. Jacobs, for their guidance. My
second reader, Professor R. J. Renard, has made sure
that no laws of meteorology have been destroyed, for
which I am grateful.

Others who assisted me should also be mentioned,
especially Professor A. L. Schoenstadt, who allowed me
access to the HP 9845B computer which drew all the
figures. The personnel at the Monterey Peninsula Water
Management District, Dr. J. Williams, Mr. B. Buel, and
Mr. K. Walsh, were also very helpful.






I. INTRODUCTION



A. THE PROBLEM 

The Monterey Peninsula Water Management District, in the
Central California coastal area, has as one of its
responsibilities the duty to recommend and/or impose water
rationing on its constituents. To do this in a rational way
requires the District to have some formula for predicting
future water availability. Although the techniques of modern
meteorology are becoming more sophisticated and exact, good
long-range predictions still cannot be made. This thesis
analyzes three series of Monterey County monthly rainfall
data by purely statistical methods in order to identify
possible predictive formulas.

B. NOTATION 

Rainfall will be denoted as $R_{t,m}$, which will represent
inches of rain recorded for the $t$th year and the $m$th month.
The year to be used is the California Water Year, which begins
in October and ends the following September. Thus $R_{1,1}$ is
the monthly rainfall for October of year '1' and $R_{6,8}$ is
the monthly rainfall of May of year '6'.

An overstruck bar, as in $\bar{R}_{..}$, will indicate the arithmetic
average of a variable; in this case it is the arithmetic
average of all years and months of rainfall. $\bar{R}_{.m}$ is the
average of rainfall over the years for month m; $\bar{R}_{t.}$
represents the yearly average for year t.

C. METHODS OF ANALYSIS 

Three methods were used to analyze the data. The first
method was to model the series using autoregressive moving
averages as described in Box and Jenkins [Ref. 1]. The
second was to use 2x2 contingency tables to identify possible
predictors. The third was logistic regression to quantify
the findings of the 2x2 contingency table analysis. These
three methods are described in later sections of this
paper.



1. ARMA(p,q) Models

A widely used approach to time series modeling proposed
by Box and Jenkins is the ARMA(p,q) model. This model
is actually a joining of two types of model, the
autoregressive and the moving average.

In the notation of Box and Jenkins:
let $\{Z_t,\ t=1,2,\ldots,n\}$ be a time series; then an ARMA(p,q)
process may be written as:

$$\tilde{Z}_t = \phi_1 \tilde{Z}_{t-1} + \cdots + \phi_p \tilde{Z}_{t-p} + a_t - \theta_1 a_{t-1} - \cdots - \theta_q a_{t-q}$$

where the $\{a_t,\ t=1,2,\ldots,n\}$ are assumed to be random shocks
distributed as independent and identically distributed (iid)
random variables with mean zero and variance $\sigma^2$, and
$\tilde{Z}_t = Z_t - \bar{Z}$. The further assumption of normality is
also usually made.

For purposes of this paper, a mapping of $R_{t,m}$
into $Z_r$, $r = 12(t-1)+m$, was made, and an ARMA analysis was
then conducted on this index-transformed series. This
analysis is described in section III.
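
The index mapping r = 12(t-1)+m can be sketched in a few lines of Python. This is only an illustration: the function names, the month-name list, and the layout of the rainfall record (a list of water years, each a list of twelve monthly values ordered October through September) are assumptions made for the example and do not come from the thesis.

```python
# Water-year month order used in the text: October = 1, ..., September = 12.
WATER_YEAR_MONTHS = ["Oct", "Nov", "Dec", "Jan", "Feb", "Mar",
                     "Apr", "May", "Jun", "Jul", "Aug", "Sep"]


def series_index(t, m):
    """Map (water year t, water-year month m) to r = 12(t-1)+m."""
    if t < 1 or not 1 <= m <= 12:
        raise ValueError("t must be >= 1 and m must lie in 1..12")
    return 12 * (t - 1) + m


def year_month(r):
    """Inverse mapping: recover (t, m) from the series index r."""
    if r < 1:
        raise ValueError("r must be >= 1")
    t0, m0 = divmod(r - 1, 12)
    return t0 + 1, m0 + 1


def flatten(rain):
    """Turn a list of water years (12 values each) into the series Z_1, Z_2, ..."""
    return [value for year in rain for value in year]
```

With October 1951 as month 1 of year 1, `series_index(13, 4)` gives 148, which agrees with the statement in section II that month 148 of data set RN is January 1964.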

2. 2x2 Table Analysis

In the validation of section IV it is found that 
the ARMA model is not very successful in describing the 
data. In section V the data is analyzed by means of 2x2 
contingency tables. These tables are good tools for explo- 
ratory data analysis in that they provide a visual display 
of the data. Statistical procedures based on the null 
hypothesis of independence can be used to quantify the 
departure from independence. The theory of 2x2 tables, 
and contingency tables in general may be found in Fleiss 
[Ref. 3], Dixon and Massey [Ref. 5], Brownlee [Ref. 6], 
and Mood, Graybill, and Boes [Ref. 7]. 

For this paper, the contingency table approach is 
used to identify a month or group of months of a year whose 
rainfall can serve as a predictor for the rainfall during 
the remaining months of the year. One predictor that was 
suggested is the rainfall in the month of January. 
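
As an illustration of how a 2x2 table quantifies departure from independence, the sketch below cross-classifies years by the sign of a control anomaly and a complement anomaly, then computes the Pearson chi-square statistic (one degree of freedom, no continuity correction) and the odds ratio. The choice of the Pearson statistic and all function names are assumptions for the example; the procedure actually used is developed in section V.

```python
import math


def two_by_two(control, complement):
    """Count years by sign of the control anomaly (rows) and complement anomaly (columns)."""
    table = [[0, 0], [0, 0]]
    for x, y in zip(control, complement):
        table[int(x > 0)][int(y > 0)] += 1
    return table


def chi_square_independence(table):
    """Pearson chi-square statistic and p-value (1 degree of freedom) for a 2x2 table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    margins = (a + b, c + d, a + c, b + d)
    if 0 in margins:
        raise ValueError("every row and column total must be positive")
    chi2 = n * (a * d - b * c) ** 2 / (margins[0] * margins[1] * margins[2] * margins[3])
    # For one degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def odds_ratio(table):
    """Cross-product ratio ad/bc; undefined when b or c is zero."""
    (a, b), (c, d) = table
    if b == 0 or c == 0:
        raise ValueError("odds ratio undefined when an off-diagonal cell is zero")
    return (a * d) / (b * c)
```

An odds ratio well above one means that years with an above-average control tend to be followed by an above-average complement; the p-value indicates how surprising the table would be under independence.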

3. Logistic Analysis

Once a predictor is tentatively identified it becomes 
necessary to quantify the degree, direction and accuracy of 
the predictor. 

A logistic analysis is conducted by dividing the data
for a year into two sets, the predictor or control set, and the
predictand or complement set. For this analysis, the predictor
is the logged anomaly of January rainfall for the year;
that is, if $X_t$ denotes the predictor or control for year
t, then

$$X_t = \ln(R_{t,4}) - \frac{1}{N}\sum_{t=1}^{N} \ln(R_{t,4}) \tag{I.2}$$

(The logarithm is used to better symmetrize the model.) The
complement is the raw anomaly of the total rainfall for the
immediately subsequent eleven months; that is, if $Y'_t$ denotes
the complement for year t, then

$$Y'_t = \left(\sum_{m=5}^{12} R_{t,m} + \sum_{m=1}^{3} R_{t+1,m}\right) - \frac{1}{N-1}\sum_{t=1}^{N-1}\left(\sum_{m=5}^{12} R_{t,m} + \sum_{m=1}^{3} R_{t+1,m}\right) \tag{I.3}$$

Finally, the data are transformed into a binary representation,
relative to zero, as

$$Y_t = \begin{cases} 0 & \text{if } Y'_t < 0 \\ 1 & \text{if } Y'_t > 0 \end{cases} \tag{I.4}$$

In section VI the model fit is

$$P(Y=1 \mid X=x) = \frac{e^{\alpha+\beta x}}{1+e^{\alpha+\beta x}} \tag{I.5}$$

where x is as before and $P(Y=1 \mid X=x)$ is interpreted as:
"the conditional probability that the subsequent eleven month
total rainfall will be above its mean, given that the logged
anomaly of January rainfall was 'x'".
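
The construction of the predictor, the complement, the binary response, and the fitted probability (Equations I.2 through I.5) can be sketched as follows. The data layout (a list of water years, each a list of twelve monthly totals with October first) and the function names are assumptions for illustration. The sketch also assumes every January total is positive, since Equation I.2 takes a plain logarithm, and it maps a complement anomaly of exactly zero to 0, a case Equation I.4 leaves open.

```python
import math

JAN = 4  # January is month 4 of the water year (October = 1)


def january_predictor(rain):
    """X_t of Equation I.2: logged January rainfall minus its mean over all years."""
    logs = [math.log(year[JAN - 1]) for year in rain]
    mean = sum(logs) / len(logs)
    return [value - mean for value in logs]


def complement_anomaly(rain):
    """Y'_t of Equation I.3 for t = 1..N-1: the eleven months after January, mean removed."""
    totals = [
        sum(rain[t][JAN:]) + sum(rain[t + 1][:JAN - 1])  # months 5..12 of t, 1..3 of t+1
        for t in range(len(rain) - 1)
    ]
    mean = sum(totals) / len(totals)
    return [value - mean for value in totals]


def binary_response(anomalies):
    """Y_t of Equation I.4; an anomaly of exactly zero is mapped to 0 (assumption)."""
    return [1 if y > 0 else 0 for y in anomalies]


def logistic_probability(x, alpha, beta):
    """P(Y=1 | X=x) of Equation I.5 for given parameters alpha and beta."""
    z = alpha + beta * x
    return math.exp(z) / (1.0 + math.exp(z))
```

Because the complement needs three months from year t+1, it is defined only for the first N-1 years, so only the first N-1 predictor values can be paired with a response.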
II. THE DATA

A. GENERAL

Three data sets were used for this analysis. The locations
at which these data sets were gathered are shown in Figure 1.
As the figure indicates, two of the data sets are on the
Monterey Peninsula proper, while the third set, SC, represents
the Carmel River Watershed at the San Clemente Dam.

Although data exist in all cases to the present, all
three sets were truncated at September of 1974. The remaining
data, up through September of 1980, were reserved for
validation of the models and methodology.

The data coordinates are:

Data set RN:   36° 35' 42" North Latitude   121° 54' 43" West Longitude
Data set FL:   36° 35' 30" North Latitude   121° 56' 30" West Longitude
Data set SC:   36° 26' 12" North Latitude   121° 42' 30" West Longitude

Figure 1. Location of rainfall data sets
and the years available.

B. DATA SET RN



Data set RN consists of monthly rainfall amounts gathered
by Professor R. J. Renard, Cooperative Observer for the National
Weather Service Climatological Station, Monterey, California.
The data set begins in June 1951 and currently terminates in
September 1980. As was stated above, the analysis was
conducted only on the data between and including October 1951
and September 1974.

1. Raw Data

Appendix A contains a listing of data set RN. Figure
2 shows the raw data set. Month 1 is October 1951, month 148
is January 1964, and the series runs up to month 288, which is
September 1974. As can be seen, the data are strongly seasonal.
This is enough to indicate that the series, as stated, is highly
non-stationary.

The data presented so far are monthly. Next to be
considered is the series of yearly total rainfalls. The
results are shown in Figure 3 (yearly total rainfall),
Figure 4 (correlogram of yearly rainfall), and Table 1
(estimated autocorrelations). In this case, the correlogram
indicates stationarity and independence of the yearly series.
A plot of the lag one relationships, Figure 5, reinforces
this indication of independence.

Figure 2. Monthly rainfall in inches for data set RN

The correlograms and partial correlograms to follow
indicate the approximate 95% significance levels using
dashed lines. For development of these significance
levels see Box and Jenkins [Ref. 1].
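
A correlogram and its approximate significance band can be sketched as below. The estimator is the usual sample autocorrelation $r_k = c_k / c_0$. The band of plus or minus 1.96 divided by the square root of n is the common large-sample approximation for an uncorrelated series and is an assumption here; the exact levels drawn in the figures follow the development in Box and Jenkins and may differ. Function names are illustrative only.

```python
import math


def autocorrelations(series, max_lag):
    """Sample autocorrelations r_1..r_max_lag, with r_k = c_k / c_0."""
    n = len(series)
    if max_lag >= n:
        raise ValueError("max_lag must be smaller than the series length")
    mean = sum(series) / n
    dev = [z - mean for z in series]
    c0 = sum(d * d for d in dev)
    if c0 == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    return [
        sum(dev[i] * dev[i + k] for i in range(n - k)) / c0
        for k in range(1, max_lag + 1)
    ]


def approx_95_bound(n):
    """Approximate 95% band for the autocorrelations of an uncorrelated series."""
    return 1.96 / math.sqrt(n)
```

For a yearly series of 23 values (an assumption based on the water years October 1951 through September 1974), `approx_95_bound(23)` is about 0.41, so estimated autocorrelations smaller than that in magnitude would not be judged significant under this approximation.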




Figure 3. Yearly total rainfall for data set RN
(1951 - 1974).

Figure 4. Correlogram of yearly total rainfall
for data set RN.

TABLE 1

ESTIMATED AUTOCORRELATIONS OF YEARLY
TOTAL RAINFALL FOR DATA SET RN

AUTOCORRELATIONS

LAG    VALUE     LAG    VALUE
  1    -.200       7     .044
  2     .109       8    -.109
  3    -.295       9    -.009
  4     .186      10    -.139
  5    -.249      11     .248
                  12    -.211



[Scatter plot; axis label: LAGGED ELEMENT W(t-1); axis ticks from 10.470 to 29.690]

Figure 5. Lag one plot of yearly rainfall data
for data set RN.



2. Swept Data

Pierce [Ref. 9] and Hipel [Ref. 11] suggest various
ways to remove the seasonality of data sets like RN, FL, and
SC. The basic, and most straightforward, of these methods
is to remove the various monthly means. This is accomplished
by the following replacement:
let

$$R_{t,m} \leftarrow R_{t,m} - \bar{R}_{.m} \tag{II.5}$$

where $\bar{R}_{.m}$ represents the mean of the month m.

One statistic that is a byproduct of the calculations
of $\bar{R}_{.m}$ is $S_m^2$, defined as the estimated variance of the
monthly data points:

$$S_m^2 = \frac{1}{N-1}\left(\sum_{t=1}^{N} R_{t,m}^2 - N\bar{R}_{.m}^2\right) \tag{II.6}$$

These statistics for data set RN are shown in Table 2, and
illustrated in Figure 6. In the same way as the raw data were
mapped into a series, a series is created from
$R_{t,m}$ as:

$$Z_r = R_{t,m}, \quad r = 12(t-1)+m \tag{II.7}$$

TABLE 2

MONTHLY MEANS AND VARIANCE FOR DATA SET RN

MONTH     MEAN    VARIANCE
    1     .677      .5292
    2    2.478     3.6354
    3    3.355     5.9147
    4    4.124     5.4146
    5    2.592     4.5122
    6    2.639     3.7414
    7    1.708     2.7699
    8     .434      .2494
    9     .219      .1004
   10     .058      .0056
   11     .104      .0129
   12     .275      .3953
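
The removal of monthly means (Equation II.5) and the monthly variance (Equation II.6) can be sketched as follows. The data layout, a list of N water years each holding twelve monthly totals with October first, and the function names are assumptions made for illustration.

```python
def monthly_means(rain):
    """Mean rainfall for each of the 12 water-year months, averaged over years."""
    n = len(rain)
    return [sum(year[m] for year in rain) / n for m in range(12)]


def monthly_variances(rain):
    """S_m^2 of Equation II.6: (sum of squares - N * mean^2) / (N - 1)."""
    n = len(rain)
    if n < 2:
        raise ValueError("at least two years are needed to estimate a variance")
    means = monthly_means(rain)
    return [
        (sum(year[m] ** 2 for year in rain) - n * means[m] ** 2) / (n - 1)
        for m in range(12)
    ]


def sweep(rain):
    """Equation II.5: subtract each month's mean, leaving the monthly anomalies."""
    means = monthly_means(rain)
    return [[year[m] - means[m] for m in range(12)] for year in rain]
```

After `sweep`, every month's anomalies average to zero by construction, which is what removes the seasonal pattern in the means. The variances are left unchanged, so any seasonality in the spread remains.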
Figure 6. Monthly means for data set RN

Figure 7. Monthly rainfall anomalies in inches
for data set RN

Figure 8. Correlogram of the monthly rainfall
anomalies for data set RN

TABLE 3

ESTIMATED AUTOCORRELATIONS OF MONTHLY RAINFALL
ANOMALIES FOR DATA SET RN

LAG    VALUE     LAG    VALUE
  1     .249      14    -.057
  2    -.090      15    -.109
  3    -.059      16    -.006
  4     .041      17     .067
  5     .032      18     .004
  6    -.035      19     .013
  7    -.011      20    -.007
  8    -.043      21     .063
  9    -.073      22     .023
 10    -.091      23    -.066
 11     .076      24    -.044
 12    -.020      25     .021
 13    -.012
3. Logged and Swept Data

The data should now be stationary in the means.
However, as seen in Table 2, the variances of monthly
rainfall amounts are not homogeneous. Kilmartin [Ref. 10]
discusses various transformations of the data to remove
this heteroskedasticity. A plot of the variance versus mean,
Figure 9 below, indicates that the logarithmic transform
of the data might be useful.



[Plot: MEANS VS VARIANCE RN; horizontal axis MEANS, 0.000 to 6.000; vertical axis 0.000 to 6.000]

Figure 9. Plot of monthly variance against
monthly means for data set RN
Since the data contain zeros, the following modified
logarithmic transformation is done:

$$R'_{t,m} = \ln(R_{t,m}+1) - \frac{1}{N}\sum_{t=1}^{N} \ln(R_{t,m}+1) \tag{II.8}$$

where the effect of the addition of the one is mostly to
preserve the mapping of zeros into zeros. A more in-depth
discussion of this transformation is found in Kilmartin
[Ref. 10]. The mapping is performed again as before, and the
monthly means $\bar{R}'_{.m}$ and variances $S'^2_m$ of the logged
data are calculated in a manner similar to II.6 and
shown in Table 4 and Figures 10 and 11.
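
The modified logarithmic transformation of Equation II.8 can be sketched in the same style. As before, the data layout (a list of water years of twelve monthly totals) and the function name are assumptions for illustration rather than the thesis's own program.

```python
import math


def logged_anomalies(rain):
    """Equation II.8: ln(R + 1) for every month, then subtract each month's logged mean."""
    logged = [[math.log(value + 1.0) for value in year] for year in rain]
    n = len(logged)
    means = [sum(year[m] for year in logged) / n for m in range(12)]
    return [[year[m] - means[m] for m in range(12)] for year in logged]
```

Adding one before taking the logarithm keeps dry months defined, since ln(0 + 1) = 0, whereas ln(0) does not exist.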



TABLE 4

MONTHLY MEANS AND VARIANCE FOR
LOGGED DATA SET RN

MONTH     MEAN    VARIANCE
    1     .438      .1549
    2    1.092      .3454
    3    1.312      .3538
    4    1.539      .2003
    5    1.104      .3795
    6    1.133      .3706
    7     .854      .2728
    8     .319      .0767
    9     .176      .0401
   10     .054      .0045
   11     .094      .0094
   12     .185      .0849
[Plot: vertical axis MEANS; horizontal axis MONTHS, 1.000 to 12.000]

Figure 10. Monthly means of logged data set RN

Figure 11. Plot of monthly variance against monthly
means for logged data set RN
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These transformations, the logarithm followed by 
the removal of the monthly means of the logged data, result 
in the series listed in Appendix A and described in Figures 
12 and 13 with Table 5. 

These displays indicate that a suitably stationary 
series has been obtained. Other methods, such as differencing, 
scaling, and Box-Cox transformations, see Hipel [Ref. 11], 
were tried but with less success. 
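The autocorrelation values reported in the correlogram tables of this chapter can be reproduced in outline with the usual sample autocorrelation estimator. The following is a minimal sketch, assuming the estimator r_k = c_k / c_0 described in Box and Jenkins [Ref. 1]; the function name and the choice of Python with NumPy are illustrative assumptions and are not the program used for this thesis.

```python
import numpy as np

def sample_autocorrelations(z, max_lag=25):
    """Return r_1 ... r_max_lag for the series z, using r_k = c_k / c_0."""
    z = np.asarray(z, dtype=float)
    d = z - z.mean()                 # deviations from the series mean
    c0 = np.dot(d, d)                # lag-zero sum of squares
    return np.array([np.dot(d[:-k], d[k:]) / c0
                     for k in range(1, max_lag + 1)])

# Hypothetical alternating series: strong negative lag-one correlation.
series = np.array([1.0, -1.0] * 50)
r = sample_autocorrelations(series, max_lag=2)
print(r)
```

For a series with no serial dependence, the estimated autocorrelations have a standard error of roughly one over the square root of the series length, which gives a rough scale against which to judge tabulated values.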



Figure 12a. Logged anomalies of monthly
rainfall for data set RN. Months 1 - 148
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Figure 12b. Logged anomalies of monthly 
rainfall for data set RN 
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Figure 13. Correlogram of logged anomalies of 
monthly rainfall from data set RN 
TABLE 5

ESTIMATED AUTOCORRELATIONS OF LOGGED
ANOMALIES OF MONTHLY RAINFALL FROM
DATA SET RN

AUTOCORRELATIONS

LAG     VALUE      LAG     VALUE

  1      .191       14     -.033
  2     -.102       15     -.084
  3     -.095       16     -.024
  4      .071       17      .062
  5      .056       18     -.015
  6     -.053       19     -.008
  7     -.009       20      .013
  8     -.014       21      .013
  9     -.022       22     -.038
 10     -.069       23     -.024
 11      .024       24     -.012
 12     -.004       25      .032
 13      .032







C. DATA SET FL 

The label for these data derives from its location, 
Forest Lake, on the Monterey Peninsula, in Pebble Beach, 
California. Data set FL consists of monthly rainfall 
figures gathered by the California-American Water Company 
since 1896. Although this data set starts quite early,
the data prior to 1937 have frequent missing observations.
Therefore, this data set is taken as October 1937 through 
September 1974, with October 1974 through September 1980 
reserved for validation. 
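The split between the fitting period and the validation period can be checked with a short sketch. The helper `water_year_months` is hypothetical and is introduced only to count the months in each period, assuming a water year that runs from October through September as stated above.

```python
def water_year_months(first_october_year, last_september_year):
    """List (year, month) pairs from October of the first year
    through September of the last year."""
    months = []
    for year in range(first_october_year, last_september_year):
        months += [(year, m) for m in range(10, 13)]      # October-December
        months += [(year + 1, m) for m in range(1, 10)]   # January-September
    return months

fit_period = water_year_months(1937, 1974)          # used for fitting
validation_period = water_year_months(1974, 1980)   # held back for validation
print(len(fit_period), len(validation_period))      # prints "444 72"
```

The 444 months of the fitting period agree with the month ranges given in the captions of Figures 14a and 14b.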

Analysis of this data set is identical to that of data
set RN; therefore only the pertinent figures and tables
are shown.
Figure 14a. Months 1 - 296 of rainfall in
inches for data set FL
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Figure 14b. Months 297-444 of rainfall 
in inches for data set FL 



Figure 15. Yearly total rainfall for data set FL
(1937 - 1974).
Figure 16. Correlogram of yearly total rainfall
for data set FL



TABLE 6

ESTIMATED AUTOCORRELATIONS OF YEARLY
TOTAL RAINFALL FOR DATA SET FL

AUTOCORRELATIONS

LAG     VALUE      LAG     VALUE

  1      .010       14     -.028
  2      .099       15      .105
  3     -.022       16     -.087
  4      .207       17      .085
  5     -.214       18     -.018
  6      .028       19     -.245
  7     -.003       20     -.237
  8     -.061       21     -.035
  9     -.138       22     -.031
 10     -.060       23     -.106
 11      .236       24     -.096
 12     -.119       25      .120
 13     -.151
2. Swept Data



TABLE 7

MONTHLY MEANS AND VARIANCE FOR DATA SET FL

MONTH      MEAN     VARIANCE

  1        .744      .4895
  2       2.235     3.7444
  3       3.049     4.2480
  4       3.537     4.4240
  5       2.999     5.9492
  6       2.724     3.1743
  7       1.559     2.4505
  8        .449      .2033
  9        .153      .0417
 10        .077      .0081
 11        .115      .0081
 12        .189      .1350
[Plot: MEANS against MONTHS (1 to 12)]


Figure 17. Monthly means for data set FL 
Figure 18a. Months 1 - 296 of rainfall anomalies
in inches for data set FL
Figure 18b. Months 297 - 444 of rainfall anomalies
in inches for data set FL
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Figure 19. Correlogram of monthly rainfall 
anomalies for data set FL 
TABLE 8

ESTIMATED AUTOCORRELATIONS OF MONTHLY
RAINFALL ANOMALIES FOR DATA SET FL

AUTOCORRELATIONS

LAG     VALUE      LAG     VALUE

  1      .244       14     -.053
  2     -.007       15     -.043
  3     -.056       16     -.002
  4      .027       17      .039
  5      .016       18     -.014
  6     -.026       19     -.011
  7     -.022       20     -.031
  8     -.021       21      .023
  9     -.051       22      .067
 10     -.020       23     -.007
 11      .077       24      .041
 12      .059       25      .039
 13      .068







3. Logged and Swept Data
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Figure 20. Plot of monthly variance against monthly 
means for data set FL 
TABLE 9

MONTHLY MEANS AND VARIANCE FOR LOGGED DATA SET FL

MONTH      MEAN     VARIANCE

  1        .484      .1440
  2       1.007      .3440
  3       1.269      .2781
  4       1.398      .2579
  5       1.218      .3421
  6       1.184      .3006
  7        .796      .2717
  8        .338      .0589
  9        .130      .0234
 10        .071      .0061
 11        .106      .0064
 12        .146      .0044
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Figure 21. Monthly means of logged 
data set FL 
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Figure 22. Plot of monthly variance against monthly 
means for logged data set FL 
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Figure 23a. Months 1 - 148 of logged rainfall 
anomalies for data set FL 
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Figure 23b. Months 149 - 444 of logged rainfall 
anomalies from data set FL 
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Figure 24. Correlogram of logged anomalies of monthly 
rainfall from data set FL. 

TABLE 10

ESTIMATED AUTOCORRELATIONS OF LOGGED
ANOMALIES FROM MONTHLY RAINFALL OF
DATA SET FL

AUTOCORRELATIONS

LAG     VALUE      LAG     VALUE

  1      .185       14     -.052
  2     -.020       15     -.021
  3     -.081       16     -.020
  4      .046       17      .040
  5      .043       18     -.040
  6     -.050       19     -.025
  7     -.024       20     -.014
  8     -.010       21      .007
  9     -.027       22      .019
 10      .004       23     -.015
 11      .031       24      .076
 12      .084       25      .047
 13      .074
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D. DATA SET SC 



The label for these data derives from their location, San
Clemente Dam, on the Carmel River in Central California,
approximately 26 kilometers southeast of data sets RN and
FL on the Monterey Peninsula. Data set SC consists of
monthly rainfall figures gathered by the California-American
Water Company since 1926.

Analysis of this data set is again very close to that
of the previous data sets, and only the displays will be
given.

1. Raw Data 



[Plot: horizontal axis: MONTHS OF DATA]



Figure 25a. Months 1 (October 1926) - 148 
(January 1938) of rainfall in inches for 
data set SC. 
Figure 25b. Months 149 - 444 of rainfall
for data set SC




Figure 25c. Months 445 - 576 of rainfall for 
data set SC 
[Plot: horizontal axis: YEARS OF DATA]



Figure 26. Yearly total rainfall for data 
set SC (1926 - 1974) 
[Plot: CORRELOGRAM. Vertical axis: autocorrelations]




Figure 27. Correlogram of yearly total 
rainfall for data set SC 



TABLE 11

ESTIMATED AUTOCORRELATIONS OF YEARLY
TOTAL RAINFALL FOR DATA SET SC

AUTOCORRELATIONS

LAG     VALUE      LAG     VALUE

  1     -.050       14     -.109
  2      .135       15      .161
  3     -.158       16      .077
  4      .168       17      .116
  5     -.042       18      .025
  6      .081       19     -.214
  7     -.084       20      .066
  8     -.217       21      .019
  9     -.111       22     -.028
 10     -.107       23     -.101
 11      .158       24     -.030
 12     -.260       25     -.042
 13     -.110
2. Swept Data



TABLE 12

MONTHLY MEANS AND VARIANCES
FOR DATA SET SC

MONTH      MEAN     VARIANCE

  1        .698      .5945
  2       2.175     4.2382
  3       3.940    10.1783
  4       4.599     8.4899
  5       4.353    13.3443
  6       3.080     5.2744
  7       1.700     3.4486
  8        .431      .1912
  9        .111      .0451
 10        .017      .0055
 11        .037      .0125
 12        .103      .1040
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Figure 28. Monthly means for data set SC 
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Figure 29a. Months 1 - 296 anomalies in inches 
for data set SC 
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Figure 29b. Months 297 - 576 anomalies in 
inches for data set SC 
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Figure 30. Correlogram of monthly rainfall 
anomalies for data set SC 



TABLE 13

ESTIMATED AUTOCORRELATIONS OF MONTHLY
RAINFALL ANOMALIES FOR DATA SET SC

AUTOCORRELATIONS

LAG     VALUE      LAG     VALUE

  1      .140       14     -.088
  2     -.039       15     -.051
  3     -.021       16     -.006
  4      .012       17      .015
  5     -.001       18     -.006
  6     -.019       19     -.008
  7      .003       20     -.027
  8     -.013       21     -.003
  9     -.038       22     -.122
 10     -.038       23     -.006
 11      .014       24      .011
 12      .051       25      .011
 13      .102
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3. Logged and Swept Data 
Figure 31. Plot of monthly variance against monthly
means for data set SC.



TABLE 14

MONTHLY MEANS AND VARIANCES OF
LOGGED DATA SET SC

MONTH      MEAN     VARIANCE

  1        .444      .1623
  2        .961      .3999
  3       1.408      .3949
  4       1.583      .3117
  5       1.444      .4928
  6       1.243      .3556
  7        .817      .3247
  8        .320      .0726
  9        .092      .0227
 10        .015      .0040
 11        .031      .0083
 12        .075      .0351
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Figure 32. Monthly means of logged data set SC 
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Figure 33. Plot of monthly variance against monthly 
means for logged data set SC 
Figure 34a. Months 1 - 296 of logged rainfall
anomalies from data set SC
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Figure 34b. Months 297 - 576 of logged rainfall 
anomalies from data set SC 
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Figure 35. Correlogram of logged anomalies of 
monthly rainfall from data set SC 



TABLE 15

ESTIMATED AUTOCORRELATIONS OF LOGGED
ANOMALIES OF MONTHLY RAINFALL FROM
DATA SET SC

AUTOCORRELATIONS

LAG     VALUE      LAG     VALUE

  1      .096       14     -.057
  2     -.065       15     -.022
  3     -.066       16     -.023
  4      .038       17      .021
  5      .012       18     -.017
  6     -.061       19     -.005
  7      .023       20     -.008
  8     -.001       21     -.011
  9     -.056       22      .092
 10     -.021       23     -.019
 11      .007       24      .050
 12      .091       25      .002
 13      .091
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III. FIRST ORDER MARKOV MODEL 



A. THEORY 

As first shown by equation 1.1, the general ARMA(p,q) 
model is: 




$$Z_t = \phi_1 Z_{t-1} + \cdots + \phi_p Z_{t-p} + a_t - \theta_1 a_{t-1} - \cdots - \theta_q a_{t-q} \qquad \text{(III.1)}$$



The development and discussion of this type of model is
contained in detail in Box and Jenkins [Ref. 1] and Nelson
[Ref. 8]. The modeling process is a three-part procedure.
The parts are:

(1) Identification 

(2) Estimation 

(3) Diagnosis. 

Identification is conducted using the correlogram and
a plot of the partial autocorrelations (or partial
correlogram). The partial autocorrelations are related to the
autocorrelations; see Box and Jenkins [Ref. 1], Nelson
[Ref. 8], or Richards and Woodall [Ref. 12]. The partial
autocorrelations are used to determine the order of the
autoregressive process, much as the autocorrelations may
be used to determine the order of the moving average
process.
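The relation between the autocorrelations and the partial autocorrelations can be sketched with the Durbin-Levinson recursion. The following is an illustration only: the function name is hypothetical, and the thesis does not state which algorithm was used to produce its partial correlograms.

```python
import numpy as np

def partial_autocorrelations(r):
    """Durbin-Levinson recursion.

    r: autocorrelations r_1 ... r_K.
    Returns the partial autocorrelations phi_11 ... phi_KK.
    """
    r = np.asarray(r, dtype=float)
    pacf = np.zeros(len(r))
    phi_prev = np.zeros(0)              # coefficients of the order k-1 fit
    for k in range(1, len(r) + 1):
        if k == 1:
            phi_kk = r[0]
        else:
            num = r[k - 1] - np.dot(phi_prev, r[k - 2::-1])
            den = 1.0 - np.dot(phi_prev, r[:k - 1])
            phi_kk = num / den
        phi_new = np.empty(k)
        phi_new[:k - 1] = phi_prev - phi_kk * phi_prev[::-1]
        phi_new[k - 1] = phi_kk
        pacf[k - 1] = phi_kk
        phi_prev = phi_new
    return pacf

# First three autocorrelations of the logged anomalies of data set RN (Table 5).
print(partial_autocorrelations([0.191, -0.102, -0.095]))
```

Applied to the rounded values of Table 5, the recursion gives approximately .191, -.144 and -.048, which is close to the first three entries of Table 16; small differences in the third decimal are to be expected from rounding of the inputs.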

Once the autocorrelations and partial autocorrelations 
have been found, the degree of the ARMA may be estimated by 
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techniques described in Box and Jenkins, Nelson or Richards 
and Woodall. Each of the data sets, once logged and swept, 
indicated that the most probable model was an ARMA(1,0), or
AR(1), more commonly called a first-order autoregressive Markov
model. This model is simply:

$$Z_t = \rho Z_{t-1} + a_t \qquad \text{(III.2)}$$

where $\rho$ is the autocorrelation at lag one. Thus, this
model indicates that, given the lag one value, the data are
conditionally independent of the rest of the past.

Subsections B, C, and D below show this model as applied
to the three data sets of interest. The residuals of the
model, $a_t = Z_t - \rho Z_{t-1}$, are examined. The residuals appear to
be independent; however, they do not appear to be normally
distributed. For example, there is a high peak around zero.

One possible reason for this discrepancy may be the dichotomy 
of winter and summer rain as indicated in Tables 2, 4, 7, 9, 
12, and 14. The existence of months with zero rainfall during 
the summer suggests that one should consider the summer, when 
rain is sparse, completely separate from the winter when rain 
is more abundant. Therefore, also shown in the subsections 
below is the autoregressive model applied to the winter months 
only. This is accomplished by stripping out months 9 through
12 (June through September) of the data sets and treating the
remaining data as a continuous set. In other words, the
first ten months are then

$$R_{1,1},\ R_{1,2},\ R_{1,3},\ R_{1,4},\ R_{1,5},\ R_{1,6},\ R_{1,7},\ R_{1,8},\ R_{2,1},\ R_{2,2}$$
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The appropriate correlograms and partial correlograms are 
displayed prior to the model applications. 
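The stripping of the summer months can be sketched as below. This is a minimal illustration, assuming the series is stored alongside an array of month numbers with month 1 = October; the names `SUMMER_MONTHS` and `strip_summer` are hypothetical.

```python
import numpy as np

SUMMER_MONTHS = (9, 10, 11, 12)   # June through September when month 1 = October

def strip_summer(series, months):
    """Drop the summer months and return the remaining values as one
    continuous series, together with their month numbers."""
    series = np.asarray(series, dtype=float)
    months = np.asarray(months)
    keep = ~np.isin(months, SUMMER_MONTHS)
    return series[keep], months[keep]

# Two hypothetical water years of placeholder values.
months = np.tile(np.arange(1, 13), 2)
values = np.arange(24, dtype=float)
winter_values, winter_months = strip_summer(values, months)
print(winter_months[:10])   # months 1 to 8 of year one, then months 1 and 2 of year two
```

Note that in the stripped series the value following month 8 (May) is month 1 (October) of the next water year, so a lag of one there spans the omitted summer.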

B. DATA SET RN 

1. Twelve Month Series 

This data set is described in Section II.B. The
remaining diagnostic device needed is the partial correlogram 
of Figure 36 and the corresponding values in Table 16. 



[Plot: PARTIAL CORRELOGRAM]




Figure 36. Partial correlogram of the logged 
rainfall anomalies of data set RN 
TABLE 16

ESTIMATED PARTIAL-AUTOCORRELATIONS FOR
LOGGED RAINFALL ANOMALIES OF DATA SET RN

LAG     VALUE      LAG     VALUE

  1      .191       14     -.040
  2     -.144       15     -.073
  3     -.047       16      .001
  4      .092       17      .054
  5      .006       18     -.067
  6     -.058       19      .039
  7      .037       20      .012
  8     -.034       21     -.001
  9     -.028       22     -.043
 10     -.056       23      .013
 11      .048       24     -.043
 12     -.041       25     -.034
 13      .047







The model of interest is then 



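$$Z_t = .191\,Z_{t-1} + a_t \qquad \text{(III.3)}$$

where the random shocks $\{a_t\}$ are assumed to be distributed
iid $N(0, \sigma^2)$ and $\sigma^2$ is estimated as

$$\hat{\sigma}^2 = \frac{1}{N-1} \sum_{t=1}^{N} \left( Z_t - .191\,Z_{t-1} \right)^2 \qquad \text{(III.4)}$$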
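The fitting of the first order model and the calculation of its residuals can be sketched as follows. This is an illustration under stated assumptions: the coefficient is taken to be the lag-one sample autocorrelation, the first observation has no residual because it has no predecessor, and the divisor is the number of residuals less one. The function name `fit_ar1` is hypothetical, and the simulated series stands in for the logged anomalies.

```python
import numpy as np

def fit_ar1(z):
    """Fit Z_t = rho * Z_{t-1} + a_t; return rho, residuals, and variance."""
    z = np.asarray(z, dtype=float)
    d = z - z.mean()
    rho = np.dot(d[:-1], d[1:]) / np.dot(d, d)       # lag-one autocorrelation
    resid = z[1:] - rho * z[:-1]                     # a_t = Z_t - rho * Z_{t-1}
    sigma2 = np.sum(resid ** 2) / (len(resid) - 1)   # assumed N-1 style divisor
    return rho, resid, sigma2

# Simulated AR(1) series with a known coefficient (not thesis data).
rng = np.random.default_rng(0)
shocks = rng.standard_normal(5000)
z = np.zeros(5000)
for t in range(1, 5000):
    z[t] = 0.5 * z[t - 1] + shocks[t]
rho, resid, sigma2 = fit_ar1(z)
print(rho, sigma2)   # should fall near 0.5 and 1.0 respectively
```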



The goodness of this fit may be viewed in two ways. 
Firstly, are the residuals $\{a_t\}$ independent? Secondly,
are the residuals distributed as Normal (Gaussian) random 
variables? A plot of the residuals follows in Figure 37. 

The question of independence is addressed in Figure 38 
(Correlogram) , Figure 39 (Lag one plot) , Figure 40 (Residuals 
vs. lag one) , and Table 17 (Turning points) . For a discussion 
of the usefulness of the turning points see Kendall [Ref. 14], 
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All of these displays and tests tend to indicate that the 
residuals are in fact serially independent. The statistics 
of the residuals are in Table 18. A Normal Plot of the 
residuals (Figure 41) , in which the sample is normalized by 
removing the mean and scaling by the standard deviation and 
then plotted on normal paper, should yield a nearly straight 
line corresponding to the dashed line of the figure. The 
Normal Plot accompanied by the sample histogram (Figure 42) 
addresses the normality of these data. As may be seen from 
the kurtosis, the fluctuations of the sample CDF near the 
midpoint, and the peak of the histogram, the normality of
these data is questionable. To confirm this, a chi-squared
goodness of fit test was conducted yielding a value of 49.18 
with 17 degrees of freedom, again rejecting any hypothesis 
of normality at a significance level of $5 \times 10^{-5}$.
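A chi-squared goodness of fit test of the standardized residuals against the normal distribution can be sketched as follows. The binning is an assumption: twenty equal-probability cells are used here only because, after one degree of freedom is lost to the total and two to the estimated mean and variance, they give 17 degrees of freedom. The cells actually used in this thesis are not stated, and the function name is hypothetical.

```python
import numpy as np
from statistics import NormalDist

def chi_squared_normality(resid, bins=20):
    """Chi-squared statistic of standardized residuals against N(0,1),
    using equal-probability cells."""
    resid = np.asarray(resid, dtype=float)
    z = (resid - resid.mean()) / resid.std(ddof=1)     # standardize the sample
    # Interior cut points that split N(0,1) into `bins` equal-probability cells.
    cuts = [NormalDist().inv_cdf(i / bins) for i in range(1, bins)]
    observed = np.bincount(np.searchsorted(cuts, z), minlength=bins)
    expected = len(z) / bins
    statistic = float(np.sum((observed - expected) ** 2 / expected))
    dof = bins - 1 - 2                                  # mean and variance estimated
    return statistic, dof, observed

# Simulated normal sample (not thesis data).
rng = np.random.default_rng(1)
statistic, dof, observed = chi_squared_normality(rng.standard_normal(288))
print(statistic, dof)
```

The statistic is then referred to the chi-squared distribution with the stated degrees of freedom; a sharply peaked sample such as the residuals described above places too many observations in the central cells and inflates the statistic.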




Figure 37a. First order Markov residuals from logged
rainfall anomalies of data set RN. Months 1 - 148




Figure 37b. First order Markov residuals from 
logged rainfall anomalies of data set RN. 
Months 149 - 292 



[Plot: CORRELOGRAM. Horizontal axis: LAG]

Figure 38. Autocorrelations of residuals from
first order Markov process applied to the logged
rainfall anomalies of data set RN
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Figure 39. Lag one plot of first order Markov 
residuals from logged rainfall anomalies of 
data set RN 
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Figure 40. First order Markov residuals versus 
lag one data point from logged rainfall anomalies 
of data set RN 



TABLE 17

ACTUAL AND EXPECTED NUMBER OF TURNING POINTS
AND ACTUAL AND EXPECTED PHASE FREQUENCIES
FOR THE FIRST ORDER MARKOV RESIDUALS FROM
THE LOGGED RAINFALL ANOMALIES OF DATA SET RN

NUMBER OF TURNING POINTS = 191
E[P] = 190.667
V[P] = 15.899

PHASE LENGTHS

  D      OBSERVED     EXPECTED

  1        117         118.8
  2         56          52.1
  3         15          14.9
  4          3           3.2
  5          0            .6
  6          0            .1
  7          0           0.0
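The turning point count and the expected values of the kind shown in Table 17 can be sketched as follows, using the standard results for a random series given by Kendall [Ref. 14]: the expected number of turning points is 2(N-2)/3, and the expected number of phases of length d is 2(N-d-2)(d^2+3d+1)/(d+3)!. The function names are hypothetical. The series length N = 288 used in the example is an inference from the tabulated E[P] and is not stated in the text.

```python
import numpy as np
from math import factorial

def turning_points(x):
    """Count strict peaks and troughs in the series x."""
    x = np.asarray(x, dtype=float)
    left = x[1:-1] - x[:-2]     # change coming into each interior point
    right = x[1:-1] - x[2:]     # negative of the change leaving it
    return int(np.sum(left * right > 0))

def expected_turning_points(n):
    """Expected number of turning points in a random series of length n."""
    return 2.0 * (n - 2) / 3.0

def expected_phases(n, d):
    """Expected number of phases of length d in a random series of length n."""
    return 2.0 * (n - d - 2) * (d * d + 3 * d + 1) / factorial(d + 3)

print(turning_points([1, 3, 2, 4, 3]))               # 3 turning points
print(round(expected_turning_points(288), 3))        # 190.667
print([round(expected_phases(288, d), 1) for d in (1, 2, 3)])
```

With N = 288 the expected phase counts for lengths one to three round to 118.8, 52.1 and 14.9, in agreement with the expected column of Table 17.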
TABLE 18

GENERAL STATISTICS OF FIRST ORDER
MARKOV RESIDUALS FROM LOGGED RAINFALL
ANOMALIES OF DATA SET RN

Moments

Mean                 -.001
Variance              .117
Skewness             -.066
Kurtosis              .523

Percentiles

Minimum             -1.141
Lower Sixteenth      -.745
Lower Eighth         -.463
Lower Quartile       -.174
Median               -.014
Upper Quartile        .211
Upper Eighth          .436
Upper Sixteenth       .706
Maximum              1.246
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Figure 41. Standardized normal plot of first 
order Markov residuals from logged rainfall 
anomalies of data set RN
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